Analysis of a Blue Screen Failure

CPP开发者 2021-07-20

Source: CSDN - xdesk

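A colleague ran into a blue screen problem at a customer site, spent a long time on it without reaching a conclusion, and asked me to help analyze the dump file. This article records the whole analysis.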

1. Background

A computer was blue-screening frequently with the bug check code DRIVER_POWER_STATE_FAILURE, for example:

DRIVER_POWER_STATE_FAILURE (9f)
A driver has failed to complete a power IRP within a specific time.
Arguments:
Arg1: 0000000000000004, The power transition timed out waiting to synchronize with the Pnp
 subsystem.
Arg2: 000000000000012c, Timeout in seconds.
Arg3: ffffb103caf02080, The thread currently holding on to the Pnp lock.
Arg4: ffffee8635e5f7e0, nt!TRIAGE_9F_PNP on Win7 and higher
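
As a small aid for reading the parameters above, here is a minimal sketch that decodes them. The helper name describe_9f is made up for illustration; the values are copied from the output above, and the sketch only covers the Arg1 == 4 case shown there.

```python
# Bug check parameters copied from the DRIVER_POWER_STATE_FAILURE output above.
args = {
    "Arg1": 0x0000000000000004,  # PnP synchronization timeout
    "Arg2": 0x000000000000012C,  # timeout in seconds
    "Arg3": 0xFFFFB103CAF02080,  # thread holding the PnP lock
    "Arg4": 0xFFFFEE8635E5F7E0,  # nt!TRIAGE_9F_PNP
}


def describe_9f(params):
    """Summarize a 0x9F bug check whose Arg1 is 4 (hypothetical helper)."""
    if params["Arg1"] != 0x4:
        raise ValueError("this sketch only handles Arg1 == 4")
    timeout_s = params["Arg2"]
    return {
        "timeout_seconds": timeout_s,
        "timeout_minutes": timeout_s / 60,
        "pnp_lock_owner_thread": f"{params['Arg3']:016x}",
    }


summary = describe_9f(args)
print(summary)
```

In other words, 0x12c is a 300-second (5-minute) timeout, and the thread to look at first is ffffb103caf02080.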

According to feedback from the site, the blue screen seemed to occur on every boot. Next, let's analyze the cause in detail.

2. Analysis

First, let's look at the stack at the time of the blue screen:

Implicit thread is now ffff9b00`abfe9240
 # RetAddr           : Args to Child                                                           : Call Site
00 fffff806`3ca9f6f6 : 00000000`0000009f 00000000`00000004 00000000`0000012c ffffb103`caf02080 : nt!KeBugCheckEx
01 fffff806`3cdaf9a6 : ffffee86`35e5fa10 00000000`00000070 fffff806`38195800 00000000`00000001 : nt!PnpBugcheckPowerTimeout+0x76
02 fffff806`3c8c1d49 : ffffee86`36a27230 00000001`516253a0 00000001`00000002 ffffb103`dae7f000 : nt!PopBuildDeviceNotifyListWatchdog+0x16
03 fffff806`3c8c0aa9 : 00000000`000000100000000`00989680 00000000`00005458 00000000`00000070 : nt!KiProcessExpiredTimerList+0x169
04 fffff806`3c9c5ebe : ffffffff`00000000 ffff9b00`abfd8180 ffff9b00`abfe9240 ffffb103`dd515080 : nt!KiRetireDpcList+0x4e9
05 00000000`00000000 : ffffee86`35e60000 ffffee86`35e59000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x7e

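From the above we can roughly guess that handling a PnP power operation timed out, and that the watchdog detected the timeout and raised the bug check. We won't dwell on the watchdog mechanism here; the problem is certainly caused by the timeout. Let's look at the stack of the thread holding the PnP lock: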

THREAD ffffb103caf02080  Cid 0004.0120  Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
    ffffb103ddf6c5b0  NotificationEvent
IRP List:
    ffffb103dfde65e0: (0006,03e8) Flags: 00000000  Mdl: 00000000
Not impersonating
DeviceMap                 ffffc3875d014ba0
Owning Process            ffffb103c8cae040       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      17026          Ticks: 19200 (0:00:05:00.000)
Context Switch Count      8393           IdealProcessor: 7  NoStackSwap
UserTime                  00:00:00.000
KernelTime                00:00:00.156
Win32 Start Address nt!ExpWorkerThread (0xfffff8063c8f42b0)
Stack Init ffffee86360b8b90 Current ffffee86360b7da0
Base ffffee86360b9000 Limit ffffee86360b2000 Call 0000000000000000
Priority 15 BasePriority 12 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffee86`360b7de0 fffff806`3c91507d : ffff9b00`abfd8180 8010001f`fffffffe ffff9b00`ffffffff 00000000`00000001 : nt!KiSwapContext+0x76
ffffee86`360b7f20 fffff806`3c913f04 : ffffb103`caf02080 00000000`00000000 ffffb103`00000000 fffff806`00000000 : nt!KiSwapThread+0xbfd
ffffee86`360b7fc0 fffff806`3c9136a5 : ffffb103`c8cb3cb0 ffffb103`00000000 00000000`00000000 00000000`00000000 : nt!KiCommitThreadWait+0x144
ffffee86`360b8060 fffff806`407e9920 : ffffb103`ddf6c5b0 fffff806`00000000 ffffb103`db5ca000 fffff806`00000000 : nt!KeWaitForSingleObject+0x255
ffffee86`360b8140 fffff806`407dcb89 : ffffb103`ddf6c590 ffffee86`360b82d0 ffffb103`ddf6c590 fffff806`407cc8eb : ndis!KWaitEventBase<wistd::integral_constant<enum _EVENT_TYPE,0> >::Wait+0x28
ffffee86`360b8180 fffff806`4080bb01 : ffffb103`ddf6b1a0 ffffee86`360b82d0 ffffb103`ddf6c590 ffffb103`db5ca020 : ndis!Ndis::BindEngine::ApplyBindChanges+0x10915
ffffee86`360b81d0 fffff806`407750d4 : ffffb103`ddf6b1a0 ffffb103`ddf6b1a0 fffff806`407b5050 fffff806`407b5050 : ndis!ndisPnPRemoveDevice+0x2fd
ffffee86`360b8410 fffff806`407e919f : ffffb103`ddf6b1a0 ffffee86`360b85f0 ffffb103`ddf6b1a0 ffffb103`dfde6500 : ndis!ndisPnPRemoveDeviceEx+0x148
ffffee86`360b8460 fffff806`40774f31 : ffffb103`ddf6b1a0 ffffb103`ddf6b1a0 ffffee86`360b8630 ffffb103`dfde65e0 : ndis!ndisPnPIrpSurpriseRemovalInner+0x13f
ffffee86`360b8560 fffff806`407193c4 : ffffb103`dfde65e0 00000000`00000000 00000000`00000000 ffffb103`ddf6b1a0 : ndis!ndisPnPIrpSurpriseRemoval+0xed
ffffee86`360b85b0 fffff806`3c90a929 : ffffee86`360b8601 ffffb103`ddf6b050 00000000`00000001 ffffee86`360b8720 : ndis!ndisPnPDispatch+0x31354
ffffee86`360b8620 fffff806`3cdbafe4 : ffffee86`360b86c0 ffffb103`ddf6b050 ffffee86`360b8720 ffffb103`dfde65e0 : nt!IofCallDriver+0x59
ffffee86`360b8660 fffff806`3cf3123a : 00000000`00000017 ffffb103`ddf75af0 ffffb103`dbeb52a0 ffffb103`ddf75af0 : nt!IopSynchronousCall+0xf8
ffffee86`360b86e0 fffff806`3cf30df9 : ffffc387`6e7b4790 00000000`00000000 00000000`00000308 00000000`00000000 : nt!IopRemoveDevice+0x106
ffffee86`360b87a0 fffff806`3cf30bbb : ffffb103`ddf75af0 00000000`00000000 00000000`00000000 00000000`00000000 : nt!PnpSurpriseRemoveLockedDeviceNode+0xb5
ffffee86`360b8800 fffff806`3cf3088a : ffffb103`ddf75af0 ffffee86`360b8880 00000000`00000000 fffff806`3cf3064f : nt!PnpDeleteLockedDeviceNode+0x57
ffffee86`360b8840 fffff806`3cf2f087 : ffffb103`dbeb5960 00000008`00000002 00000000`00000008 00000000`00000000 : nt!PnpDeleteLockedDeviceNodes+0x76
ffffee86`360b88c0 fffff806`3cf0867e : ffffee86`360b8a10 ffffb103`dd9e6900 ffffee86`360b8a00 ffffc387`00000008 : nt!PnpProcessQueryRemoveAndEject+0x1ef
ffffee86`360b89b0 fffff806`3cdd6748 : ffffc387`6e7b4790 ffffc387`6a87d1e0 ffffc387`6a87d1e0 00000000`00000000 : nt!PnpProcessTargetDeviceEvent+0xea
ffffee86`360b89e0 fffff806`3c8f43b5 : ffffb103`c8cb3cb0 ffffb103`caf02080 ffffb103`c8cb3cb0 fffff806`3cc61340 : nt!PnpDeviceEventWorker+0x2d8
ffffee86`360b8a70 fffff806`3c86bcd5 : ffffb103`caf02080 00000000`00000080 ffffb103`c8cae040 000024ef`bd9bbfff : nt!ExpWorkerThread+0x105
ffffee86`360b8b10 fffff806`3c9c9998 : ffff9b00`ac524180 ffffb103`caf02080 fffff806`3c86bc80 00000000`00000000 : nt!PspSystemThreadStartup+0x55
ffffee86`360b8b60 00000000`00000000 : ffffee86`360b9000 ffffee86`360b2000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28
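
The "Ticks: 19200 (0:00:05:00.000)" line in the output above can be checked by hand. The sketch below assumes a clock tick of 15.625 ms (64 ticks per second); that value is an assumption that happens to reproduce the debugger's own conversion, and the function name is hypothetical.

```python
TICK_MS = 15.625  # assumed tick interval in milliseconds (64 ticks per second)


def ticks_to_dhms(ticks):
    """Format a tick count as days:hh:mm:ss.mmm, truncating to whole milliseconds."""
    total_ms = int(ticks * TICK_MS)
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return f"{days}:{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


waited = ticks_to_dhms(19200)
print(waited)
```

Under that assumption the thread has been waiting for 300 seconds, which is the same value as Arg2 (0x12c) of the bug check.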

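Here we see a thread in the System process waiting for an NDIS event, and from its stack we can roughly tell that this thread holds the PnP lock. Let's see who is waiting for that lock: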

THREAD ffffb103da5bb080  Cid 03c8.03cc  Teb: 0000009f3443b000 Win32Thread: ffffb103da407810 WAIT: (WrResource) KernelMode Non-Alertable
    ffffee8636a270e0  SynchronizationEvent
Not impersonating
DeviceMap                 ffffc3875d014ba0
Owning Process            ffffb103da5ba080       Image:         wininit.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      36004          Ticks: 222 (0:00:00:03.468)
Context Switch Count      439            IdealProcessor: 2             
UserTime                  00:00:00.000
KernelTime                00:00:00.062
Win32 Start Address 0x00007ff7e0c836f0
Stack Init ffffee8636a27b90 Current ffffee8636a26c60
Base ffffee8636a28000 Limit ffffee8636a21000 Call 0000000000000000
Priority 15 BasePriority 15 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffee86`36a26ca0 fffff806`3c91507d : ffff9b00`abfd8180 ffffffd7`fffffffe ffffb103`ffffffff 00000000`00000001 : nt!KiSwapContext+0x76
ffffee86`36a26de0 fffff806`3c913f04 : ffffb103`da5bb080 ffffb103`00000000 ffffb103`00000000 ffffb103`00000000 : nt!KiSwapThread+0xbfd
ffffee86`36a26e80 fffff806`3c9136a5 : 00000000`0000006c fffff806`00000000 00000000`00000000 00000000`00000000 : nt!KiCommitThreadWait+0x144
ffffee86`36a26f20 fffff806`3c91691d : ffffee86`36a270e0 00000000`0000001b 00000000`00000000 ffffee86`36a27400 : nt!KeWaitForSingleObject+0x255
ffffee86`36a27000 fffff806`3c90fcd7 : fffff806`3cc629a0 ffffee86`36a270c8 00000000`00010224 fffff806`3c95a6c0 : nt!ExpWaitForResource+0x6d
ffffee86`36a27080 fffff806`3cdeaeac : ffffee86`36a271c0 00000000`00000000 ffffee86`36a271a0 00000000`00000001 : nt!ExAcquireResourceExclusiveLite+0x217
ffffee86`36a27110 fffff806`3c957e9c : 00000000`00000000 00000000`00000000 00000000`00000001 00000000`00000005 : nt!PpDevNodeLockTree+0x58
ffffee86`36a27140 fffff806`3cd9d386 : fffff806`3cb799e8 fffff806`3c86fc33 00000000`40580088 00000000`00000000 : nt!PnpLockDeviceActionQueue+0x10
ffffee86`36a27180 fffff806`3cd9d2fb : 00000000`00000000 00000000`00000000 ffffb103`dbcfe9b0 fffff806`3caee9fe : nt!IoBuildPoDeviceNotifyList+0x4a
ffffee86`36a271e0 fffff806`3cf25d00 : 00000000`00000000 ffffb103`dbcfe980 00000000`00000000 00000000`00000000 : nt!PopBuildDeviceNotifyList+0xb7
ffffee86`36a272c0 fffff806`3cd9a79f : ffffee86`36a273d0 ffffee86`36a27448 ffffee86`36a273d0 000001ab`ff7c0000 : nt!PoInitializeBroadcast+0xc8
ffffee86`36a272f0 fffff806`3cd9f19c : ffffb103`c8ee8000 fffff806`00000006 00000000`00000005 00000000`00989680 : nt!PopTransitionSystemPowerStateEx+0x233
ffffee86`36a273b0 fffff806`3c9d3c15 : ffffb103`00000000 00000000`00000000 00000000`00000000 ffffb103`c8ee2000 : nt!NtSetSystemPowerState+0x4c
ffffee86`36a27590 fffff806`3c9c61b0 : fffff806`3cd9a6ba ffffee86`36a27810 ffffee86`36a277b0 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffffee86`36a27590)
ffffee86`36a27728 fffff806`3cd9a6ba : ffffee86`36a27810 ffffee86`36a277b0 00000000`00000000 00000000`00000000 : nt!KiServiceLinkage
ffffee86`36a27730 fffff806`3cd9f19c : ffffd8a4`d1924af4 fffff806`00000006 00000000`00000005 00000000`00000000 : nt!PopTransitionSystemPowerStateEx+0x14e
ffffee86`36a277f0 fffff806`3d10d069 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!NtSetSystemPowerState+0x4c
ffffee86`36a279d0 fffff806`3c9d3c15 : ffffb103`da5bb080 ffffee86`00000001 ffffee86`36a27a80 ffffee86`36a27a80 : nt!NtShutdownSystem+0x39
ffffee86`36a27a00 00007ff9`fccdf624 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffffee86`36a27a00)
0000009f`3432f538 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ff9`fccdf624

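It does look like wininit was shutting the system down (this differs a little from the customer's report that it happens at boot, but it does not affect the analysis). It requested the device tree lock (nt!PpDevNodeLockTree) and went into a wait state, because the lock it wants is held by the thread whose stack is shown above.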

Let's continue with the thread that holds the lock (ffffb103caf02080).

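From its stack we find the event it is waiting on (ffffb103`ddf6c5b0, the first argument of nt!KeWaitForSingleObject):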

4: kd> dt nt!_KEVENT ffffb103`ddf6c5b0
   +0x000 Header           : _DISPATCHER_HEADER
4: kd> dx -id 0,0,ffffb103c8cae040 -r1 (*((ntkrnlmp!_DISPATCHER_HEADER *)0xffffb103ddf6c5b0))
(*((ntkrnlmp!_DISPATCHER_HEADER *)0xffffb103ddf6c5b0))                 [Type: _DISPATCHER_HEADER]
    [+0x000] Lock             : 393216 [Type: long]
    [+0x000] LockNV           : 393216 [Type: long]
    [+0x000] Type             : 0x0 [Type: unsigned char]
    [+0x001] Signalling       : 0x0 [Type: unsigned char]
    [+0x002] Size             : 0x6 [Type: unsigned char]
    [+0x003] Reserved1        : 0x0 [Type: unsigned char]
    [+0x000] TimerType        : 0x0 [Type: unsigned char]
    [+0x001] TimerControlFlags : 0x0 [Type: unsigned char]
    [+0x001 ( 0: 0)] Absolute         : 0x0 [Type: unsigned char]
    [+0x001 ( 1: 1)] Wake             : 0x0 [Type: unsigned char]
    [+0x001 ( 7: 2)] EncodedTolerableDelay : 0x0 [Type: unsigned char]
    [+0x002] Hand             : 0x6 [Type: unsigned char]
    [+0x003] TimerMiscFlags   : 0x0 [Type: unsigned char]
    [+0x003 ( 5: 0)] Index            : 0x0 [Type: unsigned char]
    [+0x003 ( 6: 6)] Inserted         : 0x0 [Type: unsigned char]
    [+0x003 ( 7: 7)] Expired          : 0x0 [Type: unsigned char]
    [+0x000] Timer2Type       : 0x0 [Type: unsigned char]
    [+0x001] Timer2Flags      : 0x0 [Type: unsigned char]
    [+0x001 ( 0: 0)] Timer2Inserted   : 0x0 [Type: unsigned char]
    [+0x001 ( 1: 1)] Timer2Expiring   : 0x0 [Type: unsigned char]
    [+0x001 ( 2: 2)] Timer2CancelPending : 0x0 [Type: unsigned char]
    [+0x001 ( 3: 3)] Timer2SetPending : 0x0 [Type: unsigned char]
    [+0x001 ( 4: 4)] Timer2Running    : 0x0 [Type: unsigned char]
    [+0x001 ( 5: 5)] Timer2Disabled   : 0x0 [Type: unsigned char]
    [+0x001 ( 7: 6)] Timer2ReservedFlags : 0x0 [Type: unsigned char]
    [+0x002] Timer2ComponentId : 0x6 [Type: unsigned char]
    [+0x003] Timer2RelativeId : 0x0 [Type: unsigned char]
    [+0x000] QueueType        : 0x0 [Type: unsigned char]
    [+0x001] QueueControlFlags : 0x0 [Type: unsigned char]
    [+0x001 ( 0: 0)] Abandoned        : 0x0 [Type: unsigned char]
    [+0x001 ( 1: 1)] DisableIncrement : 0x0 [Type: unsigned char]
    [+0x001 ( 7: 2)] QueueReservedControlFlags : 0x0 [Type: unsigned char]
    [+0x002] QueueSize        : 0x6 [Type: unsigned char]
    [+0x003] QueueReserved    : 0x0 [Type: unsigned char]
    [+0x000] ThreadType       : 0x0 [Type: unsigned char]
    [+0x001] ThreadReserved   : 0x0 [Type: unsigned char]
    [+0x002] ThreadControlFlags : 0x6 [Type: unsigned char]
    [+0x002 ( 0: 0)] CycleProfiling   : 0x0 [Type: unsigned char]
    [+0x002 ( 1: 1)] CounterProfiling : 0x1 [Type: unsigned char]
    [+0x002 ( 2: 2)] GroupScheduling  : 0x1 [Type: unsigned char]
    [+0x002 ( 3: 3)] AffinitySet      : 0x0 [Type: unsigned char]
    [+0x002 ( 4: 4)] Tagged           : 0x0 [Type: unsigned char]
    [+0x002 ( 5: 5)] EnergyProfiling  : 0x0 [Type: unsigned char]
    [+0x002 ( 6: 6)] SchedulerAssist  : 0x0 [Type: unsigned char]
    [+0x002 ( 7: 7)] ThreadReservedControlFlags : 0x0 [Type: unsigned char]
    [+0x003] DebugActive      : 0x0 [Type: unsigned char]
    [+0x003 ( 0: 0)] ActiveDR7        : 0x0 [Type: unsigned char]
    [+0x003 ( 1: 1)] Instrumented     : 0x0 [Type: unsigned char]
    [+0x003 ( 2: 2)] Minimal          : 0x0 [Type: unsigned char]
    [+0x003 ( 5: 3)] Reserved4        : 0x0 [Type: unsigned char]
    [+0x003 ( 6: 6)] UmsScheduled     : 0x0 [Type: unsigned char]
    [+0x003 ( 7: 7)] UmsPrimary       : 0x0 [Type: unsigned char]
    [+0x000] MutantType       : 0x0 [Type: unsigned char]
    [+0x001] MutantSize       : 0x0 [Type: unsigned char]
    [+0x002] DpcActive        : 0x6 [Type: unsigned char]
    [+0x003] MutantReserved   : 0x0 [Type: unsigned char]
    [+0x004] SignalState      : 0 [Type: long]
    [+0x008] WaitListHead     [Type: _LIST_ENTRY]
4: kd> dx -r1 (*((ntkrnlmp!_LIST_ENTRY *)0xffffb103ddf6c5b8))
(*((ntkrnlmp!_LIST_ENTRY *)0xffffb103ddf6c5b8))                 [Type: _LIST_ENTRY]
    [+0x000] Flink            : 0xffffb103de5901c0 [Type: _LIST_ENTRY *]
    [+0x008] Blink            : 0xffffb103caf021c0 [Type: _LIST_ENTRY *]
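
Most of the rows in the header dump above are union views of the same four bytes, so the interesting facts fit in a few lines. The sketch below unpacks the Lock value (393216) and SignalState (0) from the output. The type table (0 = notification, 1 = synchronization) and the reading of Size as a count of 32-bit words are assumptions based on the public kernel headers, not something this dump states.

```python
import struct

lock_value = 393216   # "Lock" from the dx output above (0x00060000)
signal_state = 0      # "SignalState" from the dx output above

# The first 4 bytes of the header overlay Type, Signalling, Size, Reserved1.
type_id, signalling, size_dwords, reserved1 = struct.unpack(
    "<4B", struct.pack("<i", lock_value)
)

# Assumed mapping of dispatcher object types for events.
EVENT_TYPES = {0: "NotificationEvent", 1: "SynchronizationEvent"}

decoded = {
    "type": EVENT_TYPES.get(type_id, f"other ({type_id})"),
    "size_bytes": size_dwords * 4,   # assumes Size is counted in 32-bit words
    "signaled": signal_state > 0,
}
print(decoded)
```

This agrees with the !thread output, which lists the object as a NotificationEvent, and it shows that the event is currently not signaled, so every waiter stays blocked.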

The event itself does not tell us much more. However, from WaitListHead [Type: _LIST_ENTRY] we can find out how many threads are waiting on this event. Walking the list gives the following:

(*((ntkrnlmp!_LIST_ENTRY *)0xffffb103ddf6c5b8)) [Type: _LIST_ENTRY]
[+0x000] Flink : 0xffffb103de5901c0 [Type: _LIST_ENTRY *]
[+0x008] Blink : 0xffffb103caf021c0 [Type: _LIST_ENTRY *]
4: kd> dx -r1 ((ntkrnlmp!_LIST_ENTRY *)0xffffb103de5901c0)
((ntkrnlmp!_LIST_ENTRY *)0xffffb103de5901c0) : 0xffffb103de5901c0 [Type: _LIST_ENTRY *]
[+0x000] Flink : 0xffffb103caf021c0 [Type: _LIST_ENTRY *]
[+0x008] Blink : 0xffffb103ddf6c5b8 [Type: _LIST_ENTRY *]
4: kd> dx -r1 ((ntkrnlmp!_LIST_ENTRY *)0xffffb103caf021c0)
((ntkrnlmp!_LIST_ENTRY *)0xffffb103caf021c0) : 0xffffb103caf021c0 [Type: _LIST_ENTRY *]
[+0x000] Flink : 0xffffb103ddf6c5b8 [Type: _LIST_ENTRY *]
[+0x008] Blink : 0xffffb103de5901c0 [Type: _LIST_ENTRY *]
4: kd> dx -r1 ((ntkrnlmp!_LIST_ENTRY *)0xffffb103ddf6c5b8)
((ntkrnlmp!_LIST_ENTRY *)0xffffb103ddf6c5b8) : 0xffffb103ddf6c5b8 [Type: _LIST_ENTRY *]
[+0x000] Flink : 0xffffb103de5901c0 [Type: _LIST_ENTRY *]
[+0x008] Blink : 0xffffb103caf021c0 [Type: _LIST_ENTRY *]
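
The walk above can be reproduced offline. This sketch stores the Flink values from the output in a dictionary, follows them until it returns to the list head, and subtracts 0x140 from each entry to get a thread address. The 0x140 offset is taken from the !thread command used in this article and should be treated as specific to this Windows build.

```python
LIST_HEAD = 0xFFFFB103DDF6C5B8  # address of WaitListHead inside the event

# Flink pointers copied from the dx output above.
FLINK = {
    0xFFFFB103DDF6C5B8: 0xFFFFB103DE5901C0,
    0xFFFFB103DE5901C0: 0xFFFFB103CAF021C0,
    0xFFFFB103CAF021C0: 0xFFFFB103DDF6C5B8,
}

# Assumed offset of the wait block inside the thread object on this build.
WAIT_BLOCK_OFFSET = 0x140


def walk_list(head, flink):
    """Return every entry on a circular doubly linked list, excluding the head."""
    entries = []
    current = flink[head]
    while current != head:
        entries.append(current)
        current = flink[current]
    return entries


waiters = [entry - WAIT_BLOCK_OFFSET for entry in walk_list(LIST_HEAD, FLINK)]
for thread in waiters:
    print(f"{thread:016x}")
```

Two threads are waiting on the event: ffffb103caf02080, which is the PnP lock owner we already know, and ffffb103de590080, which is new.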

Here we see a suspicious thread:

4: kd> !thread 0xffffb103de5901c0-140
THREAD ffffb103de590080  Cid 1bd4.1bd8  Teb: 0000000000c5b000 Win32Thread: ffffb103ddf355c0 WAIT: (Executive) KernelMode Non-Alertable
    ffffb103ddf6c5b0  NotificationEvent
IRP List:
    ffffb103dd9f2960: (0006,0118) Flags: 00000884  Mdl: 00000000
Not impersonating
DeviceMap                 ffffc3875d014ba0
Owning Process            ffffb103de678480       Image:        xxxxxSvc.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      15925          Ticks: 20301 (0:00:05:17.203)
Context Switch Count      284            IdealProcessor: 2             
UserTime                  00:00:00.015
KernelTime                00:00:00.078
Win32 Start Address 0x000000000002ec08
Stack Init ffffee863a4b7b90 Current ffffee863a4b66a0
Base ffffee863a4b8000 Limit ffffee863a4b1000 Call 0000000000000000
Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffee86`3a4b66e0 fffff806`3c91507d : ffff9b00`00000000 00000000`00000000 ffff9b00`ffffffff 00000000`00000001 : nt!KiSwapContext+0x76
ffffee86`3a4b6820 fffff806`3c913f04 : ffffb103`de590080 00000000`00000000 ffffee86`3a4b7301 fffff806`00000000 : nt!KiSwapThread+0xbfd
ffffee86`3a4b68c0 fffff806`3c9136a5 : 00000000`00000004 ffffb103`00000000 00000000`00000000 00000000`00000000 : nt!KiCommitThreadWait+0x144
ffffee86`3a4b6960 fffff806`407e9920 : ffffb103`ddf6c5b0 fffff806`00000000 ffffee86`3a4b7300 fffff806`00000000 : nt!KeWaitForSingleObject+0x255
ffffee86`3a4b6a40 fffff806`407dcb89 : 00000000`00000000 ffffee86`3a4b6bd0 ffffb103`ddf6c590 fffff806`407cc8eb : ndis!KWaitEventBase<wistd::integral_constant<enum _EVENT_TYPE,0> >::Wait+0x28
ffffee86`3a4b6a80 fffff806`40763b1a : ffffb103`ddf6b1a0 ffffee86`3a4b6bd0 00000000`00000000 ffffee86`3a4b6b48 : ndis!Ndis::BindEngine::ApplyBindChanges+0x10915
ffffee86`3a4b6ad0 fffff806`407edcc4 : ffffb103`dbfe4000 00000000`00000001 ffffb103`c87dc008 ffffb103`dbfe4010 : ndis!ndisOpenAdapterLegacyProtocol+0x262
ffffee86`3a4b6c80 fffff806`407dd5b6 : ffffc387`66892480 fffff806`00000000 ffffb103`ddf6b1a0 00000000`00000000 : ndis!ndisBindLegacyProtocol+0x2c8
ffffee86`3a4b6dd0 fffff806`407d195e : 00000000`00000000 ffffee86`0000001d ffffee86`3a4b6f40 00000000`00000000 : ndis!ndisRestartProtocol+0xb742
ffffee86`3a4b6e40 fffff806`407d13c0 : ffffb103`ddf6b1a0 ffffb103`ddf6b1a0 ffffb103`ddf6c608 ffffb103`ddf6c590 : ndis!Ndis::BindEngine::Iterate+0x4f6
ffffee86`3a4b6fc0 fffff806`407cc409 : ffffb103`ddf6c590 ffffee86`3a4b7100 00000000`00000000 00000000`00000000 : ndis!Ndis::BindEngine::UpdateBindings+0x98
ffffee86`3a4b7010 fffff806`407cc2c8 : ffffb103`ddf6c590 00000000`00000000 ffffb103`ddf6c590 fffff806`407cc8eb : ndis!Ndis::BindEngine::DispatchPendingWork+0x75
ffffee86`3a4b7040 fffff806`40763b1a : ffffb103`ddf6b1a0 ffffee86`3a4b7190 00000000`00000000 ffffee86`3a4b7108 : ndis!Ndis::BindEngine::ApplyBindChanges+0x54
ffffee86`3a4b7090 fffff806`40809a9c : ffffb103`dd9f2a00 ffffee86`3a4b7301 ffffb103`c87dc008 ffffb103`c87dc000 : ndis!ndisOpenAdapterLegacyProtocol+0x262
ffffee86`3a4b7240 fffff806`53d72edd : ffffb103`c87dc000 ffffb103`c87dc028 ffffb103`dd9f2960 ffffb103`c87ddb68 : ndis!NdisOpenAdapter+0x4c
ffffee86`3a4b72b0 fffff806`3c90a929 : ffffb103`00000000 ffffb103`cad9ee50 00000000`00000000 ffffb103`dd9f2a30 : XXXXXcap+0x2edd
ffffee86`3a4b7360 fffff806`3c9099e4 : 00000000`00000000 00000000`00000000 ffffb103`dd9f2a78 fffff806`3c90a1a3 : nt!IofCallDriver+0x59
ffffee86`3a4b73a0 fffff806`3ceaf86b : ffffee86`3a4b7660 fffff806`3ceaf225 ffffee86`3a4b75d0 ffffb103`dbb7a520 : nt!IoCallDriverWithTracing+0x34
ffffee86`3a4b73f0 fffff806`3ceb681f : ffffb103`cad9ed00 ffffb103`cad9ec25 ffffb103`dd9d4010 ffffc387`5d04a801 : nt!IopParseDevice+0x62b
ffffee86`3a4b7560 fffff806`3ceb4c81 : ffffb103`dd9d4000 ffffee86`3a4b77a8 ffffc387`00000040 ffffb103`c8cf94e0 : nt!ObpLookupObjectName+0x78f
ffffee86`3a4b7720 fffff806`3ce64f50 : ffffee86`00000001 00000000`00e3e850 00000000`00000001 00000000`00000000 : nt!ObOpenObjectByNameEx+0x201
ffffee86`3a4b7860 fffff806`3ce64719 : 00000000`00e3df80 00000000`c0100080 00000000`00e3e850 00000000`00e3df98 : nt!IopCreateFile+0x820
ffffee86`3a4b7900 fffff806`3c9d3c15 : ffffb103`de590080 ffffee86`3a4b7a80 ffffee86`3a4b79a8 00000000`00c5b000 : nt!NtCreateFile+0x79
ffffee86`3a4b7990 00007ff9`fccdcb14 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffffee86`3a4b7a00)
00000000`00e3df08 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ff9`fccdcb14

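This thread looks rather suspicious, but it does not appear to hold any resource that others are waiting for. Let's mark it as a suspect for now.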
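
One quick way to spot such a thread is to scan the Call Site column for modules that do not ship with Windows. The sketch below does this for a few frames copied from the stack above. The set of "inbox" module names is an assumption chosen for this dump, and module_of is a hypothetical helper.

```python
import re

# A few call sites copied from the suspicious thread's stack.
CALL_SITES = [
    "nt!KeWaitForSingleObject+0x255",
    "ndis!Ndis::BindEngine::ApplyBindChanges+0x10915",
    "ndis!NdisOpenAdapter+0x4c",
    "XXXXXcap+0x2edd",
    "nt!IofCallDriver+0x59",
    "nt!NtCreateFile+0x79",
]

# Assumed list of modules that belong to Windows itself in this dump.
INBOX_MODULES = {"nt", "ndis"}


def module_of(call_site):
    """Return the module name before '!' or '+', or None if there is none."""
    match = re.match(r"([A-Za-z0-9_]+)[!+]", call_site)
    return match.group(1) if match else None


third_party = [
    site for site in CALL_SITES
    if module_of(site) is not None and module_of(site) not in INBOX_MODULES
]
print(third_party)
```

A frame such as XXXXXcap+0x2edd, with no symbol name and a module outside the inbox set, is usually worth a closer look.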

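Starting from the source of the event, let's look at the code that performs the wait.

[Image: the code that waits on the event; not preserved]

There is indeed a wait on an event, which confirms the analysis above. Next, let's see where this event can be set. A search turns up two places.

The other one is the function void __fastcall Ndis::BindEngine::UpdateBindings(Ndis::BindEngine *this, struct KLockThisExclusive *a2)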


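The first place clearly needs no further analysis: it is the constructor, which is always called, and once the constructor has finished the event is signaled. That gives us a lead to follow: who cleared the event?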
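
The behavior described here can be modeled in user mode. The sketch below uses Python's threading.Event as a stand-in for a kernel notification event; it is not NDIS code, and the class name is made up. It shows only the semantics the analysis relies on: signaled after construction, blocking after Clear, released again by Set.

```python
import threading


class NotificationEventModel:
    """User-mode stand-in for a notification event (illustrative only)."""

    def __init__(self):
        self._event = threading.Event()
        self._event.set()  # per the analysis: signaled once the constructor returns

    def clear(self):
        self._event.clear()

    def set(self):
        self._event.set()

    def wait(self, timeout):
        # Returns True if the event is signaled, False if the wait timed out.
        return self._event.wait(timeout)


event = NotificationEventModel()
after_construction = event.wait(0.01)   # passes straight through
event.clear()
after_clear = event.wait(0.01)          # would block forever without a timeout
event.set()
after_set = event.wait(0.01)            # passes again, and stays signaled
print(after_construction, after_clear, after_set)
```

So if the waiters in the dump are stuck, some code path cleared the event and never got to set it again.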

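Continuing the analysis, we find that the Clear operation is invoked in the caller of void __fastcall Ndis::BindEngine::UpdateBindings(Ndis::BindEngine *this, struct KLockThisExclusive *a2). Hold on: this brings us back to the suspicious thread found earlier (ffffb103de590080 in xxxxxSvc.exe).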

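That thread is in the middle of exactly this call path, so this must be the call in question. Its stack contains ndis!Ndis::BindEngine::ApplyBindChanges twice: the outer call, reached from XXXXXcap through ndis!NdisOpenAdapter, goes on into DispatchPendingWork and UpdateBindings, and further up the same stack a nested ApplyBindChanges is blocked in Wait on ffffb103`ddf6c5b0. This suggests the thread is waiting for an event that was cleared earlier in its own call chain. Let's look at the IRP involved: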


4: kd> !irp ffffb103dd9f2960
Irp is active with 1 stacks 1 is current (= 0xffffb103dd9f2a30)
 No Mdl: No System Buffer: Thread ffffb103de590080:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
>[IRP_MJ_CREATE(0), N/A(0)]
            0  0 ffffb103cad9ed00 ffffb103de1e4a60 00000000-00000000    
        \Driver\XXXXcap
   Args: ffffee863a4b74f0 01000060 00000000 00000000

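So, roughly speaking, something went wrong during the NdisOpenAdapter call. I have not studied NDIS drivers in much depth, so I cannot say for sure whether this call pattern really causes the problem.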
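
To make the suspicion concrete, here is a toy model of the pattern visible in the stack: a routine clears an "idle" event, does work, and that work re-enters the same routine on the same thread. Everything in it (class name, method name, the timeout) is hypothetical and is not how NDIS is actually implemented; the kernel wait has no timeout, which is why the real thread hangs instead of returning.

```python
import threading


class BindEngineModel:
    """Toy model of a clear-then-reenter hang; not NDIS code."""

    def __init__(self):
        self.idle = threading.Event()
        self.idle.set()          # signaled after construction
        self.trace = []

    def apply_bind_changes(self, reenter, depth=0):
        # Wait until no other bind operation is in flight.
        if not self.idle.wait(timeout=0.05):
            self.trace.append(f"depth {depth}: timed out waiting for idle event")
            return False
        self.idle.clear()        # mark the engine busy
        self.trace.append(f"depth {depth}: event cleared, doing binding work")
        ok = True
        if reenter:
            # The binding work ends up opening an adapter again on this thread.
            ok = self.apply_bind_changes(False, depth + 1)
        self.idle.set()          # mark the engine idle again
        return ok


engine = BindEngineModel()
nested_result = engine.apply_bind_changes(reenter=True)
plain_result = engine.apply_bind_changes(reenter=False)
print(nested_result, plain_result)
print(engine.trace[:2])
```

In the model the nested call can never succeed, because the only code that would set the event again is the outer call that is waiting for the nested one to return.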

Still, we can roughly conclude that the hang and the blue screen are most likely caused by this driver.
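
The whole chain of waits found in the dump can be written down as a small graph and followed from the thread that triggered the watchdog. The thread and event addresses are from the dump; the edge from the event to the xxxxxSvc thread encodes the hypothesis that this thread is the one expected to set the event again, which is an assumption rather than something the dump proves.

```python
# "X waits for / is held by Y" edges reconstructed from the dump.
WAITS_FOR = {
    "wininit thread ffffb103da5bb080": "PnP device tree lock",
    "PnP device tree lock": "System worker thread ffffb103caf02080",
    "System worker thread ffffb103caf02080": "NDIS event ffffb103ddf6c5b0",
    # Assumption: the xxxxxSvc thread cleared the event and must set it again.
    "NDIS event ffffb103ddf6c5b0": "xxxxxSvc thread ffffb103de590080",
    "xxxxxSvc thread ffffb103de590080": "NDIS event ffffb103ddf6c5b0",
}


def follow(start, edges):
    """Follow wait edges from start; return the path and the node that closes a cycle."""
    path = [start]
    while path[-1] in edges:
        nxt = edges[path[-1]]
        if nxt in path:
            return path, nxt
        path.append(nxt)
    return path, None


path, cycle_at = follow("wininit thread ffffb103da5bb080", WAITS_FOR)
print(" -> ".join(path))
print("cycle closes at:", cycle_at)
```

Under that assumption the chain ends in a cycle on the NDIS event, so nothing upstream of it (the PnP lock, then the shutdown request) can ever make progress, and the 300-second watchdog fires.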

3. Verification

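I asked my colleague to rename this driver. In testing, the blue screen no longer reproduced, so we can basically conclude that a bug in this driver caused the blue screen.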
